{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Handwritten digits classification using TensorFlow"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Putting all the concepts we have learned so far, we will see how can use tensorflow to\n",
    "build a neural network to recognize handwritten digits. If you are playing around deep\n",
    "learning off late then you must have come across MNIST dataset. It is being called the hello\n",
    "world of deep learning. \n",
    "\n",
    "It consists of 55,000 data points of handwritten digits (0 to 9).\n",
    "In this section, we will see how can we use our neural network to recognize the\n",
    "handwritten digits and also we will get hang of tensorflow and tensorboard.\n",
    " "
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Import required libraries\n",
    "\n",
    "As a first step, let us import all the required libraries:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 1,
   "metadata": {},
   "outputs": [],
   "source": [
    "import warnings\n",
    "warnings.filterwarnings('ignore')\n",
    "\n",
    "\n",
    "import tensorflow as tf\n",
    "from tensorflow.examples.tutorials.mnist import input_data\n",
    "tf.logging.set_verbosity(tf.logging.ERROR)\n",
    "\n",
    "import matplotlib.pyplot as plt\n",
    "%matplotlib inline"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "1.13.1\n"
     ]
    }
   ],
   "source": [
    "print tf.__version__"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Load the Dataset\n",
    "\n",
    "In the below code, \"data/mnist\" implies the location where we store the MNIST dataset.\n",
    "one_hot=True implies we are one-hot encoding the labels (0 to 9):"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 3,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Successfully downloaded train-images-idx3-ubyte.gz 9912422 bytes.\n",
      "Extracting data/mnist/train-images-idx3-ubyte.gz\n",
      "Successfully downloaded train-labels-idx1-ubyte.gz 28881 bytes.\n",
      "Extracting data/mnist/train-labels-idx1-ubyte.gz\n",
      "Successfully downloaded t10k-images-idx3-ubyte.gz 1648877 bytes.\n",
      "Extracting data/mnist/t10k-images-idx3-ubyte.gz\n",
      "Successfully downloaded t10k-labels-idx1-ubyte.gz 4542 bytes.\n",
      "Extracting data/mnist/t10k-labels-idx1-ubyte.gz\n"
     ]
    }
   ],
   "source": [
    "mnist = input_data.read_data_sets(\"data/mnist\", one_hot=True)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Let's check what we got in our data:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 4,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "No of images in training set (55000, 784)\n",
      "No of labels in training set (55000, 10)\n",
      "No of images in test set (10000, 784)\n",
      "No of labels in test set (10000, 10)\n"
     ]
    }
   ],
   "source": [
    "print(\"No of images in training set {}\".format(mnist.train.images.shape))\n",
    "print(\"No of labels in training set {}\".format(mnist.train.labels.shape))\n",
    "\n",
    "print(\"No of images in test set {}\".format(mnist.test.images.shape))\n",
    "print(\"No of labels in test set {}\".format(mnist.test.labels.shape))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "We have 55,000 images in the training set and each image is of size 784 and we have 10 labels which are actually 0 to 9. Similarly, we have 10000 images in the test set.\n",
    "\n",
    "Now we plot one image to see how it looks like:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 5,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "<matplotlib.image.AxesImage at 0x7f7bfa160bd0>"
      ]
     },
     "execution_count": 5,
     "metadata": {},
     "output_type": "execute_result"
    },
    {
     "data": {
      "image/png": "iVBORw0KGgoAAAANSUhEUgAAAP8AAAD8CAYAAAC4nHJkAAAABHNCSVQICAgIfAhkiAAAAAlwSFlzAAALEgAACxIB0t1+/AAAADl0RVh0U29mdHdhcmUAbWF0cGxvdGxpYiB2ZXJzaW9uIDIuMi4yLCBodHRwOi8vbWF0cGxvdGxpYi5vcmcvhp/UCwAADetJREFUeJzt3X+o1XWex/HXO2fsh4roertJo3snuSxUtI4cLBtZZmlnamLAJqImQQxCIybYoRGmHGGjP+KyrA1Cy5CzyWi4OUsqSsSuJUsmbIMnszJt14o7qPnjasFk/uF65z1/3K/Dre73c07n+z3ne+59Px9wued8398fb7718nvO+Zz7/Zi7C0A8l1XdAIBqEH4gKMIPBEX4gaAIPxAU4QeCIvxAUIQfCIrwA0F9o5MHmzVrlvf19XXykEAog4ODOnPmjDWzbqHwm9kdktZJmiTp39x9ILV+X1+f6vV6kUMCSKjVak2v2/LLfjObJOlfJf1Q0vWS7jez61vdH4DOKvKef6GkD9z9I3e/IGmLpCXltAWg3YqE/1pJR0c9P5Yt+wIzW2lmdTOrDw0NFTgcgDK1/dN+d1/v7jV3r/X09LT7cACaVCT8xyXNGfX8W9kyAONAkfDvk9RvZt82s8mSfiJpZzltAWi3lof63P2imT0i6b80MtS3wd3fK60zAG1VaJzf3V+W9HJJvQDoIL7eCwRF+IGgCD8QFOEHgiL8QFCEHwiK8ANBEX4gKMIPBEX4gaAIPxAU4QeCIvxAUIQfCIrwA0ERfiAowg8ERfiBoAg/EBThB4Ii/EBQhB8IivADQRF+ICjCDwRF+IGgCD8QFOEHgiL8QFCFZuk1s0FJn0kalnTR3WtlNAWg/QqFP/P37n6mhP0A6CBe9gNBFQ2/S9plZm+a2coyGgLQGUVf9i929+NmdrWkV8zsfXffM3qF7B+FlZI0d+7cgocDUJZCV353P579Pi1pu6SFY6yz3t1r7l7r6ekpcjgAJWo5/GY2xcymXXos6QeSDpbVGID2KvKyv1fSdjO7tJ9/d/f/LKUrAG3Xcvjd/SNJf1tiLwA6iKE+ICjCDwRF+IGgCD8QFOEHgiL8QFBl/FUfKvbqq6/m1rLvYeSaMWNGsn7wYPp7W4sWLUrW+/v7k3VUhys/EBThB4Ii/EBQhB8IivADQRF+ICjCDwQ1Ycb59+zZk6y/8cYbyfratWvLbKejzp492/K2kyZNStYvXLiQrF911VXJ+tSpU3NrixcvTm77/PPPFzo20rjyA0ERfiAowg8ERfiBoAg/EBThB4Ii/EBQ42qcf2BgILe2Zs2a5LbDw8NltzMhFD0v58+fb7m+bdu25LaN7kWwcePGZH3KlCnJenRc+YGgCD8QFOEHgiL8QFCEHwiK8ANBEX4gqIbj/Ga2QdKPJJ129xuzZTMl/U5Sn6RBSfe6+6fta3PEs88+m1trNF59yy23JOvTpk1rqacy3Hbbbcn63Xff3aFOvr5du3Yl6+vWrcutHTlyJLnt1q1bW+rpkk2bNuXWuBdAc1f+30q640vLHpO02937Je3OngMYRxqG3933SPrkS4uXSLr09aqNku4quS8Abdbqe/5edz+RPT4pqbekfgB0SOEP/NzdJXle3cxWmlndzOpDQ0NFDwegJK2G/5SZzZak7PfpvBXdfb2719y91tPT0+LhAJSt1fDvlLQ8e7xc0o5y2gHQKQ3Db2YvSPofSX9jZsfM7EFJA5K+b2ZHJP1D9hzAOGIjb9k7o1areb1eb3n7M2fO5NY+/PDD5Lbz589P1i+//PKWekLap5/mf/2j0fcb3nrrrULH3rx5c25t6dKlhfbdrWq1mur1evpGCBm+4QcERfiBoAg/EBThB4Ii/EBQhB8IalwN9WFiaTRt+qJFiwrtv7c3/09OTp48WWjf3YqhPgANEX4gKMIPBEX4gaAIPxAU4QeCIvxAUIQfCIrwA0ERfiAowg8ERfiBoAg/EBThB4Ii/EBQDafoBorYsSN/Ppe9e/e29diff/55bu3o0aPJbefMmVN2O12HKz8QFOEHgiL8QFCEHwiK8ANBEX4gKMIPBNVwnN/MNkj6kaTT7n5jtuwJSSskDWWrrXb3l9vVJNLOnTuXW9u+fXty2zVr1pTdzhekxtPbPWdE6rzcdNNNyW1TU4tPFM1c+X8r6Y4xlv/K3ednPwQfGGcaht/d90j6pAO9AOigIu/5HzGzd8xsg5nNKK0jAB3Ravh/LWmepPmSTkham7eima00s7qZ1YeGhvJWA9BhLYXf3U+5+7C7/0nSbyQtTKy73t1r7l7r6elptU8AJWsp/GY2e9TTH0s6WE47ADqlmaG+FyR9T9IsMzsm6Z8kfc/M5ktySYOSHmpjjwDaoGH43f3+MRY/14Zewjp06FCyvm/fvmR9YGAgt/b++++31NNEt2rVqqpbqBzf8AOCIvxAUIQfCIrwA0ERfiAowg8Exa27S3D27Nlk/eGHH07WX3zxxWS9nX/6Om/evGT9mmuuKbT/Z555Jrc2efLk5LZLly5N1t9+++2WepKkuXPntrztRMGVHwiK8ANBEX4gKMIPBEX4gaAIPxAU4QeCYpy/SVu2bMmtPfnkk8ltDx8+nKxPmzYtWZ85c2ay/tRTT+XWGk013egW1tOnT0/W26nonZ9Svd9+++2F9j0RcOUHgiL8QFCEHwiK8ANBEX4gKMIPBEX4gaAY52/Sa6+9lltrNI7/wAMPJOurV69O1vv7+5P18er48ePJeqNbmjdyxRVX5NauvvrqQvueCLjyA0ERfiAowg8ERfiBoAg/EBThB4Ii/EBQDcf5zWyOpE2SeiW5pPXuvs7MZkr6naQ+SYOS7nX3T9vXarWefvrp3NqCBQuS265YsaLsdiaEo0ePJusff/xxof3fc889hbaf6Jq58l+U9HN3v17SLZJ+ambXS3pM0m5375e0O3sOYJxoGH53P+Hu+7PHn0k6LOlaSUskbcxW2yjprnY1CaB8X+s9v5n1SfqOpN9L6nX3E1nppEbeFgAYJ5oOv5lNlbRV0s/c/Y+jaz4ymdyYE8qZ2Uozq5tZfWhoqFCzAMrTVPjN7JsaCf5md9+WLT5lZrOz+mxJp8fa1t3Xu3vN3WtFb8gIoDwNw29mJuk5SYfdffRH3jslLc8eL5e0o/z2ALRLM3/S+11JyyS9a2YHsmWrJQ1I+g8ze1DSHyTd254Wu8OVV16ZW2MorzWpP5NuRqNbmj/66KOF9j/RNQy/u++VZDnl28ptB0Cn8A0/ICjCDwRF+IGgCD8QFOEHgiL8QFDcuhttdfPNN+fW9u/fX2jf9913X7J+3XXXFdr/RMeVHwiK8ANBEX4gKMIPBEX4gaAIPxAU4QeCYpwfbZWavvzixYvJbWfMmJGsr1q1qqWeMIIrPxAU4QeCIvxAUIQfCIrwA0ERfiAowg8ExTg/Cnn99deT9fPnz+fWpk+fntz2pZdeStb5e/1iuPIDQRF+ICjCDwRF+IGgCD8QFOEHgiL8QFANx/nNbI6kTZJ6Jbmk9e6+zsyekLRC0lC26mp3f7ldjaIaw8PDyfrjjz+erE+ePDm3tmLFiuS2t956a7KOYpr5ks9FST939/1mNk3Sm2b2Slb7lbv/S/vaA9AuDcPv7ickncgef2ZmhyVd2+7GALTX13rPb2Z9kr4j6ffZokfM7B0z22BmY95zycxWmlndzOpDQ0NjrQKgAk2H38ymStoq6Wfu/kdJv5Y0T9J8jbwyWDvWdu6+3t1r7l7r6ekpoWUAZWgq/Gb2TY0Ef7O7b5Mkdz/l7sPu/idJv5G0sH1tAihbw/CbmUl6TtJhd3961PLZo1b7saSD5bcHoF2a+bT/u5KWSXrXzA5ky1ZLut/M5mtk+G9Q0kNt6RCVGvm3P99DD6X/sy9YsCC3dsMNN7TUE8rRzKf9eyWN9X8AY/rAOMY3/ICgCD8QFOEHgiL8QFCEHwiK8ANBcetuJF12Wfr6sGzZsg51grJx5QeCIvxAUIQfCIrwA0ERfiAowg8ERfiBoMzdO3cwsyFJfxi1aJakMx1r4Ovp1t66tS+J3lpVZm9/7e5N3S+vo+H/ysHN6u5eq6yBhG7trVv7kuitVVX1xst+ICjCDwRVdfjXV3z8lG7trVv7kuitVZX0Vul7fgDVqfrKD6AilYTfzO4ws/81sw/M7LEqeshjZoNm9q6ZHTCzesW9bDCz02Z2cNSymWb2ipkdyX6POU1aRb09YWbHs3N3wMzurKi3OWb232Z2yMzeM7N/zJZXeu4SfVVy3jr+st/MJkn6P0nfl3RM0j5J97v7oY42ksPMBiXV3L3yMWEz+ztJ5yRtcvcbs2X/LOkTdx/I/uGc4e6/6JLenpB0ruqZm7MJZWaPnlla0l2SHlCF5y7R172q4LxVceVfKOkDd//I3S9I2iJpSQV9dD133yPpky8tXiJpY/Z4o0b+5+m4nN66grufcPf92ePPJF2aWbrSc5foqxJVhP9aSUdHPT+m7pry2yXtMrM3zWxl1c2MoTebNl2STkrqrbKZMTScubmTvjSzdNecu1ZmvC4bH/h91WJ3XyDph5J+mr287Uo+8p6tm4Zrmpq5uVPGmFn6L6o8d63OeF22KsJ/XNKcUc+/lS3rCu5+PPt9WtJ2dd/sw6cuTZKa/T5dcT9/0U0zN481s7S64Nx104zXVYR/n6R+M/u2mU2W9BNJOyvo4yvMbEr2QYzMbIqkH6j7Zh/eKWl59ni5pB0V9vIF3TJzc97M0qr43HXdjNfu3vEfSXdq5BP/DyX9sooecvq6TtLb2c97Vfcm6QWNvAz8f418NvKgpL+StFvSEUmvSprZRb09L+ldSe9oJGizK+ptsUZe0r8j6UD2c2fV5y7RVyXnjW/4AUHxgR8QFOEHgiL8QFCEHwiK8ANBEX4gKMIPBEX4gaD+DP+BRwSusE7dAAAAAElFTkSuQmCC\n",
      "text/plain": [
       "<Figure size 432x288 with 1 Axes>"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    }
   ],
   "source": [
    "img1 = mnist.train.images[0].reshape(28,28)\n",
    "plt.imshow(img1, cmap='Greys')"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Define the number of neurons in each layer\n",
    "\n",
    "We build a 4 layer neural network with 3 hidden layers and 1 output layer. As the size of\n",
    "the input image is 784. We set the num_input to 784 and since we have 10 handwritten\n",
    "digits (0 to 9), We set 10 neurons in the output layer. We define the number of neurons in\n",
    "each layer as follows,"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 6,
   "metadata": {},
   "outputs": [],
   "source": [
    "#number of neurons in input layer\n",
    "num_input = 784  \n",
    "\n",
    "#number of neurons in hidden layer 1\n",
    "num_hidden1 = 512  \n",
    "\n",
    "#number of neurons in hidden layer 2\n",
    "num_hidden2 = 256  \n",
    "\n",
    "#number of neurons in hidden layer 3\n",
    "num_hidden_3 = 128  \n",
    "\n",
    "#number of neurons in output layer\n",
    "num_output = 10  "
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Defining placeholders\n",
    "\n",
    "As we learned, we first need to define the placeholders for input and output. Values for\n",
    "the placeholders will be feed at the run time through feed_dict:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 7,
   "metadata": {},
   "outputs": [],
   "source": [
    "with tf.name_scope('input'):\n",
    "    X = tf.placeholder(\"float\", [None, num_input])\n",
    "\n",
    "with tf.name_scope('output'):\n",
    "    Y = tf.placeholder(\"float\", [None, num_output])"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Since we have a 4 layer network, we have 4 weights and 4 baises. We initialize our weights\n",
    "by drawing values from the truncated normal distribution with a standard deviation of\n",
    "0.1. \n",
    "\n",
    "Remember, the dimensions of the weights matrix should be a number of neurons in the\n",
    "previous layer x number of neurons in the current layer. For instance, the dimension of\n",
    "weight matrix w3 should be the number of neurons in the hidden layer 2 x number of\n",
    "neurons in hidden layer 3.\n",
    "\n",
    "We often define all the weights in a dictionary as given below:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 8,
   "metadata": {},
   "outputs": [],
   "source": [
    "with tf.name_scope('weights'):\n",
    "    \n",
    "        weights = {\n",
    "        'w1': tf.Variable(tf.truncated_normal([num_input, num_hidden1], stddev=0.1),name='weight_1'),\n",
    "        'w2': tf.Variable(tf.truncated_normal([num_hidden1, num_hidden2], stddev=0.1),name='weight_2'),\n",
    "        'w3': tf.Variable(tf.truncated_normal([num_hidden2, num_hidden_3], stddev=0.1),name='weight_3'),\n",
    "        'out': tf.Variable(tf.truncated_normal([num_hidden_3, num_output], stddev=0.1),name='weight_4'),\n",
    "    }\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The dimension of bias should be a number of neurons in the current layer. For instance, the\n",
    "dimension of bias b2 is the number of neurons in the hidden layer 2. We set the bias value\n",
    "as constant 0.1 in all layers:\n",
    "\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 9,
   "metadata": {},
   "outputs": [],
   "source": [
    "with tf.name_scope('biases'):\n",
    "\n",
    "    biases = {\n",
    "        'b1': tf.Variable(tf.constant(0.1, shape=[num_hidden1]),name='bias_1'),\n",
    "        'b2': tf.Variable(tf.constant(0.1, shape=[num_hidden2]),name='bias_2'),\n",
    "        'b3': tf.Variable(tf.constant(0.1, shape=[num_hidden_3]),name='bias_3'),\n",
    "        'out': tf.Variable(tf.constant(0.1, shape=[num_output]),name='bias_4')\n",
    "    }"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Forward Propagation\n",
    "\n",
    "Now, we define the forward propagation operation. We use relu activations in all layers\n",
    "and in the last layer we use sigmoid activation as defined below:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 10,
   "metadata": {},
   "outputs": [],
   "source": [
    "with tf.name_scope('Model'):\n",
    "    \n",
    "    with tf.name_scope('layer1'):\n",
    "        layer_1 = tf.nn.relu(tf.add(tf.matmul(X, weights['w1']), biases['b1']) )   \n",
    "    \n",
    "    with tf.name_scope('layer2'):\n",
    "        layer_2 = tf.nn.relu(tf.add(tf.matmul(layer_1, weights['w2']), biases['b2']))\n",
    "        \n",
    "    with tf.name_scope('layer3'):\n",
    "        layer_3 = tf.nn.relu(tf.add(tf.matmul(layer_2, weights['w3']), biases['b3']))\n",
    "        \n",
    "    with tf.name_scope('output_layer'):\n",
    "         y_hat = tf.nn.sigmoid(tf.matmul(layer_3, weights['out']) + biases['out'])"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Compute Loss and Backpropagate\n",
    "\n",
    "\n",
    "\n",
    "Next, we define our loss function. We use softmax cross-entropy as our loss\n",
    "function. Tensorflow\n",
    "provides tf.nn.softmax_cross_entropy_with_logits() function for computing the\n",
    "softmax cross entropy loss. It takes the two parameters as inputs logits and labels.\n",
    "\n",
    "* logits implies the logits predicted by our network. That is, y_hat\n",
    "\n",
    "* labels imply the actual labels. That is, true labels y\n",
    "\n",
    "\n",
    "We take mean of the loss using tf.reduce_mean()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 11,
   "metadata": {},
   "outputs": [],
   "source": [
    "with tf.name_scope('Loss'):\n",
    "        loss = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(logits=y_hat,labels=Y))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Now, we need to minimize the loss using backpropagation. Don't worry! We don't have to\n",
    "calculate derivatives of all the weights manually. Instead, we can use tensorflow's\n",
    "optimizer. In this section, we use Adam optimizer. It is a variant of gradient descent\n",
    "optimization technique we learned in the previous chapter. In the next chapter, we will\n",
    "dive into detail and see how exactly all the Adam and several other optimizers work. For\n",
    "now, let's say we use Adam optimizer as our backpropagation algorithm,\n",
    "\n",
    "\n",
    "tf.train.AdamOptimizer() requires the learning rate as input. So we set 1e-4 as the learning rate and we minimize the loss with minimize() function. It computes the gradients and updates the parameters (weights and biases) of our network."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 12,
   "metadata": {},
   "outputs": [],
   "source": [
    "optimizer = tf.train.AdamOptimizer(1e-4).minimize(loss)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "##  Compute Accuracy\n",
    "\n",
    "We calculate the accuracy of our model as follows. \n",
    "\n",
    "\n",
    "* y_hat denotes the predicted probability for each class by our model. Since we have 10 classes we will have 10 probabilities. If the probability is high at position 7, then it means that our network predicts the input image as digit 7 with high probability.  tf.argmax() returns the index of the largest value. Thus, tf.argmax(y_hat,1) gives the index where the probability is high. Thus, if the probability is high at index 7, then it returns 7\n",
    "<br>\n",
    "\n",
    "\n",
    "* Y denotes the actual labels and it is the one hot encoded values. That is, it consists of zeros everywhere except at the position of the actual image where it consists of 1. For instance, if the input image is 7, then Y has 0 at all indices except at index 7 where it has 1. Thus, tf.argmax(Y,1) returns 7 because that is where we have high value i.e 1. \n",
    "\n",
    "\n",
    "Thus, tf.armax(y_hat,1) gives the predicted digit and tf.argmax(Y,1) gives us the actual digit. \n",
    "\n",
    "tf.equal(x, y) takes x and y as inputs and returns the truth value of (x == y) element-wise. Thus, correct_pred = tf.equal(predicted_digit,actual_digit) consists of True where the actual and predicted digits are same and False where the actual and predicted digits are not the same. We convert the boolean values in correct_pred into float using tensorflow's cast operation. That is, tf.cast(correct_pred, tf.float32). After converting into float values, take the average using tf.treduce_mean().\n",
    "\n",
    "Thus, tf.reduce_mean(tf.cast(correct_pred, tf.float32)) gives us the average correct predictions."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 13,
   "metadata": {},
   "outputs": [],
   "source": [
    "with tf.name_scope('Accuracy'):\n",
    "    \n",
    "    predicted_digit = tf.argmax(y_hat, 1)\n",
    "    actual_digit = tf.argmax(Y, 1)\n",
    "    \n",
    "    correct_pred = tf.equal(predicted_digit,actual_digit)\n",
    "    accuracy = tf.reduce_mean(tf.cast(correct_pred, tf.float32))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Create Summary"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "We can also visualize how the loss and accuracy of our model change during several\n",
    "iterations in tensorboard. So, we use tf.summary() to get the summary of the variable.\n",
    "Since the loss and accuracy are scalar variables, we use tf.summary.scalar() to store the summary as shown below:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 14,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "<tf.Tensor 'Loss_1:0' shape=() dtype=string>"
      ]
     },
     "execution_count": 14,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "tf.summary.scalar(\"Accuracy\", accuracy)\n",
    "\n",
    "tf.summary.scalar(\"Loss\", loss)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Next, we merge all the summaries we use in our graph using tf.summary.merge_all(). We merge all summaries because when we have many summaries running and storing them would become inefficient, so we merge all the summaries and run them once in our session instead of running multiple times. "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 15,
   "metadata": {},
   "outputs": [],
   "source": [
    "merge_summary = tf.summary.merge_all()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Train the Model"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Now it is time to train our model. As we learned, first we need to initialize all the variables:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 16,
   "metadata": {},
   "outputs": [],
   "source": [
    "init = tf.global_variables_initializer()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Define the batch size and number of iterations:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 17,
   "metadata": {},
   "outputs": [],
   "source": [
    "batch_size = 128\n",
    "num_iterations = 1000"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Start the tensorflow session and perform training"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 18,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Iteration: 0, Loss: 2.30700993538, Accuracy: 0.1328125\n",
      "Iteration: 100, Loss: 1.76781439781, Accuracy: 0.7890625\n",
      "Iteration: 200, Loss: 1.6294002533, Accuracy: 0.8671875\n",
      "Iteration: 300, Loss: 1.56720340252, Accuracy: 0.9453125\n",
      "Iteration: 400, Loss: 1.55666518211, Accuracy: 0.9140625\n",
      "Iteration: 500, Loss: 1.54010999203, Accuracy: 0.9140625\n",
      "Iteration: 600, Loss: 1.54285383224, Accuracy: 0.9296875\n",
      "Iteration: 700, Loss: 1.52447938919, Accuracy: 0.9375\n",
      "Iteration: 800, Loss: 1.50830471516, Accuracy: 0.953125\n",
      "Iteration: 900, Loss: 1.55391788483, Accuracy: 0.921875\n"
     ]
    }
   ],
   "source": [
    "with tf.Session() as sess:\n",
    "\n",
    "    #run the initializer\n",
    "    sess.run(init)\n",
    "\n",
    "    #save the event files\n",
    "    summary_writer = tf.summary.FileWriter('./graphs', graph=tf.get_default_graph())\n",
    "\n",
    "    #train for some n number of iterations\n",
    "    for i in range(num_iterations):\n",
    "        \n",
    "        #get batch of data according to batch size\n",
    "        batch_x, batch_y = mnist.train.next_batch(batch_size)\n",
    "        \n",
    "        #train the network\n",
    "        sess.run(optimizer, feed_dict={\n",
    "            X: batch_x, Y: batch_y\n",
    "            })\n",
    "\n",
    "        #print loss and accuracy on every 100th iteration\n",
    "        if i % 100 == 0:\n",
    "            \n",
    "            #compute loss, accuracy and summary\n",
    "            batch_loss, batch_accuracy,summary = sess.run(\n",
    "                [loss, accuracy, merge_summary], feed_dict={X: batch_x, Y: batch_y}\n",
    "                )\n",
    "\n",
    "            #store all the summaries\n",
    "            summary_writer.add_summary(summary, i)\n",
    "\n",
    "\n",
    "            print('Iteration: {}, Loss: {}, Accuracy: {}'.format(i,batch_loss,batch_accuracy))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "As you may observe, the loss decreases and the accuracy increases over the training iterations. Now that we have learned how to build the neural network using tensorflow, in the next section we will see how can we visualize the computational graph of our model in tensorboard. "
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.6.9"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 2
}
